OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases

Internal identifier: 000818 (Main/Exploration); previous: 000817; next: 000819

Authors: Jungkap Park; Gus R. Rosania; Kazuhiro Saitou

Source:

RBID : PMC:2907084

Abstract

We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The proposed strategy combines three components: a machine vision-based tool that extracts structure diagrams from research articles and converts them into connection tables; a virtual “Chemical Expert” system that screens the converted structures according to adjustable levels of estimated conversion accuracy; and a fragment-based measure for calculating intermolecular similarity. For annotation, the calculated chemical similarity between the converted structures and entries in a virtual small molecule database is used to establish links. The overall annotation performance can be tuned by adjusting the cutoff threshold of the estimated conversion accuracy. We performed an annotation test that attempts to link 121 journal articles registered in PubMed to entries in PubChem, the largest publicly accessible chemical database. Two test cases were performed and their results compared to see how the overall annotation performance is affected by different threshold levels of the estimated accuracy of the converted structures. Our work demonstrates that over 45% of articles could have true positive links to entries in the PubChem database, with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system, which can screen the converted structures based on adjustable levels of estimated conversion accuracy, is a key factor affecting the overall annotation performance. We propose that this machine vision-based strategy can be combined with text-mining approaches to facilitate extraction of contextual scientific knowledge about a chemical structure from the scientific literature.
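The screening and linking steps described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the abstract does not specify the similarity formula, so a Tanimoto coefficient over fragment sets is swapped in, and `ConvertedStructure`, `annotate`, `precision_recall`, and both cutoff parameters are hypothetical names, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvertedStructure:
    """A structure converted from an article image (hypothetical container)."""
    article_id: str            # illustrative article identifier
    fragments: frozenset       # illustrative fragment keys for the structure
    estimated_accuracy: float  # assumed conversion-accuracy estimate in [0, 1]


def tanimoto(a, b):
    """Tanimoto coefficient between two fragment sets (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def annotate(structures, database, accuracy_cutoff, similarity_cutoff):
    """Link articles to database entries.

    Structures whose estimated accuracy falls below accuracy_cutoff are
    screened out; the rest are linked to every entry whose fragment
    similarity reaches similarity_cutoff.
    """
    links = []
    for s in structures:
        if s.estimated_accuracy < accuracy_cutoff:
            continue  # screened out before any similarity search
        for entry_id, entry_fragments in database.items():
            if tanimoto(s.fragments, entry_fragments) >= similarity_cutoff:
                links.append((s.article_id, entry_id))
    return links


def precision_recall(predicted, truth):
    """Precision and recall of predicted links against known true links."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Under these assumptions, raising `accuracy_cutoff` discards more low-confidence conversions before linking, while lowering it admits more candidate links. Running `annotate` at two cutoffs and comparing `precision_recall` for each mirrors the two-test comparison the abstract describes.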


Url:
DOI: 10.1021/ci900029v
PubMed: 19621901
PubMed Central: 2907084


Affiliations:



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases</title>
<author>
<name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<affiliation>
<nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,
<email>jungkap@umich.edu</email>
,
<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<affiliation>
<nlm:aff id="A2"> Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109,
<email>kazu@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
<affiliation>
<nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,
<email>jungkap@umich.edu</email>
,
<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">19621901</idno>
<idno type="pmc">2907084</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2907084</idno>
<idno type="RBID">PMC:2907084</idno>
<idno type="doi">10.1021/ci900029v</idno>
<date when="2009">2009</date>
<idno type="wicri:Area/Pmc/Corpus">000089</idno>
<idno type="wicri:Area/Pmc/Curation">000089</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000164</idno>
<idno type="wicri:Area/Ncbi/Merge">000073</idno>
<idno type="wicri:Area/Ncbi/Curation">000073</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000073</idno>
<idno type="wicri:doubleKey">1549-9596:2009:Park J:a:tunable:machine</idno>
<idno type="wicri:Area/Main/Merge">000826</idno>
<idno type="wicri:Area/Main/Curation">000818</idno>
<idno type="wicri:Area/Main/Exploration">000818</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases</title>
<author>
<name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<affiliation>
<nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,
<email>jungkap@umich.edu</email>
,
<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<affiliation>
<nlm:aff id="A2"> Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109,
<email>kazu@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
<author>
<name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
<affiliation>
<nlm:aff id="A1"> Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109,
<email>jungkap@umich.edu</email>
,
<email>grosania@umich.edu</email>
</nlm:aff>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Journal of chemical information and modeling</title>
<idno type="ISSN">1549-9596</idno>
<idno type="eISSN">1549-960X</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p id="P1">We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The proposed strategy combines three components: a machine vision-based tool that extracts structure diagrams from research articles and converts them into connection tables; a virtual “Chemical Expert” system that screens the converted structures according to adjustable levels of estimated conversion accuracy; and a fragment-based measure for calculating intermolecular similarity. For annotation, the calculated chemical similarity between the converted structures and entries in a virtual small molecule database is used to establish links. The overall annotation performance can be tuned by adjusting the cutoff threshold of the estimated conversion accuracy. We performed an annotation test that attempts to link 121 journal articles registered in PubMed to entries in PubChem, the largest publicly accessible chemical database. Two test cases were performed and their results compared to see how the overall annotation performance is affected by different threshold levels of the estimated accuracy of the converted structures. Our work demonstrates that over 45% of articles could have true positive links to entries in the PubChem database, with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system, which can screen the converted structures based on adjustable levels of estimated conversion accuracy, is a key factor affecting the overall annotation performance. We propose that this machine vision-based strategy can be combined with text-mining approaches to facilitate extraction of contextual scientific knowledge about a chemical structure from the scientific literature.</p>
</div>
</front>
</TEI>
<affiliations>
<list></list>
<tree>
<noCountry>
<name sortKey="Park, Jungkap" sort="Park, Jungkap" uniqKey="Park J" first="Jungkap" last="Park">Jungkap Park</name>
<name sortKey="Rosania, Gus R" sort="Rosania, Gus R" uniqKey="Rosania G" first="Gus R." last="Rosania">Gus R. Rosania</name>
<name sortKey="Saitou, Kazuhiro" sort="Saitou, Kazuhiro" uniqKey="Saitou K" first="Kazuhiro" last="Saitou">Kazuhiro Saitou</name>
</noCountry>
</tree>
</affiliations>
</record>
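
The identifiers in the record above can be read programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` and only a shortened copy of the `<publicationStmt>` element. The full record uses an `nlm:` prefix with no namespace declaration shown on this page, so a namespace-aware parser would likely reject it unless a declaration is added. The names `FRAGMENT`, `read_identifiers`, and `publication_year` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <publicationStmt> element from the record above.
FRAGMENT = """<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">19621901</idno>
<idno type="pmc">2907084</idno>
<idno type="RBID">PMC:2907084</idno>
<idno type="doi">10.1021/ci900029v</idno>
<date when="2009">2009</date>
</publicationStmt>"""


def read_identifiers(xml_text):
    """Return a dict mapping each idno 'type' attribute to its text value."""
    root = ET.fromstring(xml_text)
    identifiers = {}
    for idno in root.iter("idno"):
        identifiers[idno.get("type")] = (idno.text or "").strip()
    return identifiers


def publication_year(xml_text):
    """Return the 'when' attribute of the first <date> child, or None."""
    root = ET.fromstring(xml_text)
    date = root.find("date")
    return date.get("when") if date is not None else None


if __name__ == "__main__":
    for id_type, value in read_identifiers(FRAGMENT).items():
        print(f"{id_type}: {value}")
    print("year:", publication_year(FRAGMENT))
```

The dictionary keys are the `type` attribute values exactly as they appear in the record, so the `wicri:` strings are treated as plain text, not as namespaces.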

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000818 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000818 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     PMC:2907084
   |texte=   A Tunable Machine Vision-based Strategy for Automated Annotation of Chemical Databases
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i   -Sk "pubmed:19621901" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd   \
       | NlmPubMed2Wicri -a OcrV1 

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024